Modeling Spoken Information Queries for Virtual Assistants: Open Problems, Challenges and Opportunities
Virtual assistants are becoming increasingly important speech-driven
Information Retrieval platforms that assist users with various tasks.
We discuss open problems and challenges with respect to modeling spoken
information queries for virtual assistants, and list opportunities where
Information Retrieval methods and research can be applied to improve the
quality of virtual assistant speech recognition.
We discuss how query domain classification, knowledge graphs and user
interaction data, and query personalization can be helpful to improve the
accurate recognition of spoken information domain queries. Finally, we also
provide a brief overview of current problems and challenges in speech
recognition.
Comment: SIGIR '23. The 46th International ACM SIGIR Conference on Research &
Development in Information Retrieval
Semantic Entity Retrieval Toolkit
Unsupervised learning of low-dimensional, semantic representations of words
and entities has recently gained attention. In this paper we describe the
Semantic Entity Retrieval Toolkit (SERT) that provides implementations of our
previously published entity representation models. The toolkit provides a
unified interface to different representation learning algorithms, fine-grained
parsing configuration and can be used transparently with GPUs. In addition,
users can easily modify existing models or implement their own models in the
framework. After model training, SERT can be used to rank entities according to
a textual query and extract the learned entity/word representation for use in
downstream algorithms, such as clustering or recommendation.
Comment: SIGIR 2017 Workshop on Neural Information Retrieval (Neu-IR'17). 201
Pyndri: a Python Interface to the Indri Search Engine
We introduce pyndri, a Python interface to the Indri search engine. Pyndri
allows access to Indri indexes from Python at two levels: (1) dictionary and
tokenized document collection, (2) evaluating queries on the index. We hope
that with the release of pyndri, we will stimulate reproducible, open and
fast-paced IR research.
Comment: ECIR2017. Proceedings of the 39th European Conference on Information
Retrieval. 2017. The final publication will be available at Springe
Lexical Query Modeling in Session Search
Lexical query modeling has been the leading paradigm for session search. In
this paper, we analyze TREC session query logs and compare the performance of
different lexical matching approaches for session search. Naive methods based
on term frequency weighting perform on par with specialized session models. In
addition, we investigate the viability of lexical query models in the setting
of session search. We give important insights into the potential and
limitations of lexical query modeling for session search and propose future
directions for the field of session search.
Comment: ICTIR2016, Proceedings of the 2nd ACM International Conference on the
Theory of Information Retrieval. 201
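As a minimal sketch of the kind of naive term-frequency weighting the abstract refers to, the snippet below pools the terms of all queries in a session and weights each term by its relative frequency. The function names, the whitespace tokenization, and the toy scoring function are assumptions for illustration, not the authors' implementation.

```python
from collections import Counter


def session_query_model(session_queries):
    """Map each term to its relative frequency across all session queries."""
    counts = Counter(
        term
        for query in session_queries
        for term in query.lower().split()  # assumption: whitespace tokenization
    )
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def score(document, query_model):
    """Toy lexical score: query-term weight times document term frequency."""
    doc_counts = Counter(document.lower().split())
    return sum(weight * doc_counts[term] for term, weight in query_model.items())


# Hypothetical session consisting of three consecutive queries.
session = ["hawaii volcano", "hawaii volcano tours", "volcano tours price"]
model = session_query_model(session)
```

Terms repeated across the session ("volcano" here) receive the largest weight, so documents matching them are scored higher than documents matching a term that appears only once.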
Neural Vector Spaces for Unsupervised Information Retrieval
We propose the Neural Vector Space Model (NVSM), a method that learns
representations of documents in an unsupervised manner for news article
retrieval. In the NVSM paradigm, we learn low-dimensional representations of
words and documents from scratch using gradient descent and rank documents
according to their similarity with query representations that are composed from
word representations. We show that NVSM performs better at document ranking
than existing latent semantic vector space methods. The addition of NVSM to a
mixture of lexical language models and a state-of-the-art baseline vector space
model yields a statistically significant increase in retrieval effectiveness.
Consequently, NVSM adds a complementary relevance signal. Next to semantic
matching, we find that NVSM performs well in cases where lexical matching is
needed.
NVSM learns a notion of term specificity directly from the document
collection without feature engineering. We also show that NVSM learns
regularities related to Luhn significance. Finally, we give advice on how to
deploy NVSM in situations where model selection (e.g., cross-validation) is
infeasible. We find that an unsupervised ensemble of multiple models trained
with different hyperparameter values performs better than a single
cross-validated model. Therefore, NVSM can safely be used for ranking documents
without supervised relevance judgments.
Comment: TOIS 201
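The ranking step described above (compose a query representation from word representations, then rank documents by similarity) can be sketched as follows. This is a simplified illustration under stated assumptions: the vectors are hand-written toy values rather than learned ones, words and documents are assumed to share one space, and averaging plus cosine similarity stand in for whatever composition and similarity the model actually uses.

```python
import math


def compose_query(query, word_vectors):
    """Average the vectors of the query terms that have a representation."""
    vectors = [word_vectors[t] for t in query.lower().split() if t in word_vectors]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query, word_vectors, doc_vectors):
    """Return (doc_id, similarity) pairs, most similar document first."""
    q = compose_query(query, word_vectors)
    if q is None:
        return []
    scored = [(doc_id, cosine(q, vec)) for doc_id, vec in doc_vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy representations (not learned by any model).
word_vectors = {"stock": [1.0, 0.0], "market": [0.8, 0.2], "football": [0.0, 1.0]}
doc_vectors = {"d_finance": [0.9, 0.1], "d_sports": [0.1, 0.9]}
ranking = rank("stock market", word_vectors, doc_vectors)
```

No relevance labels are involved at ranking time, which is the property the abstract emphasizes.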
Structural Regularities in Text-based Entity Vector Spaces
Entity retrieval is the task of finding entities such as people or products
in response to a query, based solely on the textual documents they are
associated with. Recent semantic entity retrieval algorithms represent queries
and experts in finite-dimensional vector spaces, where both are constructed
from text sequences.
We investigate entity vector spaces and the degree to which they capture
structural regularities. Such vector spaces are constructed in an unsupervised
manner without explicit information about structural aspects. For concreteness,
we address these questions for a specific type of entity: experts in the
context of expert finding. We discover how clusterings of experts correspond to
committees in organizations, the ability of expert representations to encode
the co-author graph, and the degree to which they encode academic rank. We
compare latent, continuous representations created using methods based on
distributional semantics (LSI), topic models (LDA) and neural networks
(word2vec, doc2vec, SERT). Vector spaces created using neural methods, such as
doc2vec and SERT, systematically perform better at clustering than LSI, LDA and
word2vec. When it comes to encoding entity relations, SERT performs best.
Comment: ICTIR2017. Proceedings of the 3rd ACM International Conference on the
Theory of Information Retrieval. 201
Reply With: Proactive Recommendation of Email Attachments
Email responses often contain items, such as a file or a hyperlink to an
external document, that are attached to or included inline in the body of the
message. Analysis of an enterprise email corpus reveals that 35% of the time
when users include these items as part of their response, the attachable item
is already present in their inbox or sent folder. A modern email client can
proactively retrieve relevant attachable items from the user's past emails
based on the context of the current conversation, and recommend them for
inclusion, to reduce the time and effort involved in composing the response. In
this paper, we propose a weakly supervised learning framework for recommending
attachable items to the user. As email search systems are commonly available,
we constrain the recommendation task to formulating effective search queries
from the context of the conversations. The query is submitted to an existing IR
system to retrieve relevant items for attachment. We also present a novel
strategy for generating labels from an email corpus, without the need for
manual annotations, that can be used to train and evaluate the query
formulation model. In addition, we describe a deep convolutional neural network
that demonstrates satisfactory performance on this query formulation task when
evaluated on the publicly available Avocado dataset and a proprietary dataset
of internal emails obtained through an employee participation program.
Comment: CIKM2017. Proceedings of the 26th ACM International Conference on
Information and Knowledge Management. 201
Multimodal Classification of Urban Micro-Events
In this paper we seek methods to effectively detect urban micro-events. Urban
micro-events are events which occur in cities, have limited geographical
coverage and typically affect only a small group of citizens. Because of their
scale these are difficult to identify in most data sources. However, by using
citizen sensing to gather data, detecting them becomes feasible. The data
gathered by citizen sensing is often multimodal and, as a consequence, the
information required to detect urban micro-events is distributed over multiple
modalities. This makes it essential to have a classifier capable of combining
them. In this paper we explore several methods of creating such a classifier,
including early, late, hybrid fusion and representation learning using
multimodal graphs. We evaluate performance on a real world dataset obtained
from a live citizen reporting system. We show that a multimodal approach yields
higher performance than unimodal alternatives. Furthermore, we demonstrate that
our hybrid combination of early and late fusion with multimodal embeddings
performs best in classification of urban micro-events.
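To make the distinction between early and late fusion concrete, here is a minimal sketch: early fusion concatenates per-modality features before classification, while late fusion combines the outputs of per-modality classifiers. The feature values, class labels, and the unweighted average are assumptions chosen for illustration and are not taken from the paper.

```python
def early_fusion(feature_vectors):
    """Concatenate per-modality feature vectors into one input vector."""
    fused = []
    for vector in feature_vectors:
        fused.extend(vector)
    return fused


def late_fusion(modality_probs):
    """Average per-modality class probabilities into one distribution."""
    classes = modality_probs[0].keys()
    n = len(modality_probs)
    return {c: sum(p[c] for p in modality_probs) / n for c in classes}


# Hypothetical features from a text and an image modality.
text_features, image_features = [0.2, 0.7], [0.9]
fused_input = early_fusion([text_features, image_features])

# Hypothetical outputs of two separately trained unimodal classifiers.
text_probs = {"noise": 0.8, "litter": 0.2}
image_probs = {"noise": 0.4, "litter": 0.6}
fused_probs = late_fusion([text_probs, image_probs])
predicted = max(fused_probs, key=fused_probs.get)
```

A hybrid scheme would feed both the concatenated features and the per-modality outputs to a final classifier.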
Server-side Rescoring of Spoken Entity-centric Knowledge Queries for Virtual Assistants
On-device Virtual Assistants (VAs) powered by Automatic Speech Recognition
(ASR) require effective knowledge integration for the challenging task of
recognizing entity-rich queries. In this paper, we conduct an empirical study of modeling
strategies for server-side rescoring of spoken information domain queries using
various categories of Language Models (LMs) (N-gram word LMs, sub-word neural
LMs). We investigate the combination of on-device and server-side signals, and
demonstrate significant WER improvements of 23%-35% on various entity-centric
query subpopulations by integrating various server-side LMs compared to
performing ASR on-device only. We also perform a comparison between LMs trained
on domain data and a GPT-3 variant offered by OpenAI as a baseline.
Furthermore, we show that model fusion of multiple server-side LMs trained
from scratch most effectively combines complementary strengths of each model
and integrates knowledge learned from domain-specific data into a VA ASR system.
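One common way to combine an on-device recognizer score with several server-side LMs is to rescore an n-best list with a weighted sum of log scores. The sketch below illustrates that general idea only. The weighting scheme, the function names, and the two stand-in "language models" are assumptions and do not reproduce the fusion method evaluated in the paper.

```python
def rescore(nbest, language_models, weights, asr_weight=1.0):
    """Return hypotheses sorted by combined log score, best first.

    nbest: list of (text, on_device_log_score) pairs.
    language_models: callables mapping text to a log-probability.
    weights: one interpolation weight per language model.
    """
    rescored = []
    for text, asr_score in nbest:
        lm_score = sum(w * lm(text) for w, lm in zip(weights, language_models))
        rescored.append((text, asr_weight * asr_score + lm_score))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Hypothetical stand-ins for server-side LMs (real LMs would be trained models).
def entity_lm(text):
    return -1.0 if "taylor swift" in text else -6.0


def general_lm(text):
    return -0.5 * len(text.split())


# Hypothetical n-best list: the on-device system prefers the misrecognition.
nbest = [("play tailor swift", -2.0), ("play taylor swift", -2.5)]
best = rescore(nbest, [entity_lm, general_lm], weights=[0.7, 0.3])[0][0]
```

In this toy case the entity-aware score outweighs the small on-device preference for the misspelled hypothesis, so the correctly spelled entity is ranked first.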